Development of a Translation Memory System for Turkish to English
Author
Abstract
This thesis studies translation memory systems and develops one for Turkish to English. A translation memory system is a tool designed to help human translators during translation. It uses a database of text segments such as blocks, paragraphs, sentences, or phrases in one language, called the source language, together with their translations in another language, called the target language. The system searches its database for the stored sentence closest to the source sentence to be translated, and offers the best matches to the translator for review. The translator accepts the result, rejects it, modifies it, or asks for the second-best result. This thesis presents a translation memory system based on both orthographic and semantic knowledge. The translation memory concept is not widely used for Turkish. To the best of our knowledge, no translation memory system has previously been developed for Turkish to English, making this the first. The proposed system uses a sentence-level memory but exploits words and various linguistic features of the language. The similarity search algorithm takes advantage of the highly agglutinative word structure of Turkish and its inflectional and derivational affixes. The presented framework considers orthographic, morphological, lexical, semantic, and syntactic features. It achieves a good success rate on Turkish (a BLEU score of about 0.40) and is expected to be helpful for translating languages whose linguistic structure is similar to that of Turkish.
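The lookup step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: it ranks stored sentences by surface (orthographic) similarity only, using the standard library's `difflib.SequenceMatcher`, and ignores the morphological, lexical, semantic, and syntactic features the thesis exploits. The memory contents, the function names `similarity` and `lookup`, and the `top_n` and `threshold` parameters are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical toy memory: (Turkish source, English target) pairs.
MEMORY = [
    ("Kitabı okudum.", "I read the book."),
    ("Eve gidiyorum.", "I am going home."),
    ("Bugün hava çok güzel.", "The weather is very nice today."),
]


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for a richer measure."""
    return SequenceMatcher(None, a, b).ratio()


def lookup(query, memory=MEMORY, top_n=2, threshold=0.5):
    """Return up to top_n (score, source, target) matches, best first.

    Matches scoring below the threshold are dropped, so the translator
    is not shown candidates that are too far from the query.
    """
    scored = [(similarity(query, src), src, tgt) for src, tgt in memory]
    scored = [match for match in scored if match[0] >= threshold]
    scored.sort(key=lambda match: match[0], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    # A near match: only the verb's person suffix differs from a stored sentence.
    for score, src, tgt in lookup("Eve gidiyoruz."):
        print(f"{score:.2f}  {src}  ->  {tgt}")
```

Returning a ranked list rather than a single hit mirrors the workflow in the abstract, where the translator may accept the best match, edit it, or ask for the next one.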
Similar resources
A Contrastive Study of Theme in English and Azerbaijani Turkish Fictional Texts
Thematisation is one of the troublesome areas both for translation from or into English and for learning EFL. The main reason for the problem is that different languages usually structure thematisation in different ways. Therefore, the present research is an attempt to investigate contrastively: experiential (topical), interpersonal and textual themes in a sample of A...
The GREYC translation memory for the IWSLT 2009 evaluation campaign: one step beyond translation memory
This year’s GREYC translation system is an improved translation memory that was designed from scratch to experiment with an approach whose goal is just to improve over the output of a standard translation memory by making heavy use of sub-sentential alignments in a restricted case of translation by analogy. The tracks the system participated in are all BTEC tracks: Arabic to English, Chinese to...
An English-to-Turkish Interlingual MT System
This paper describes the integration of a Turkish generation system with the KANT knowledge-based machine translation system to produce a prototype English–Turkish interlingua-based machine translation system. These two independently constructed systems were successfully integrated within a period of two months, through development of a module which maps KANT interlingua expressions to Turkish ...
An English to Turkish Machine Translation System Using Structural Mapping
This paper describes the design and implementation of an English-Turkish machine translation (MT) system developed as a part of the TU-Language project supported by a NATO Science for Stability Project grant. The system uses a structural transfer approach in translating the domain of IBM computer manuals. The general design of the translation system and a detailed description of the transfer c...
Aligning Turkish and English Parallel Texts for Statistical Machine Translation
This paper presents a preliminary work on aligning Turkish and English parallel texts towards developing a statistical machine translation system for English and Turkish. To avoid the data sparseness problem and to uncover relations between sublexical components of words such as morphemes, we have converted our parallel texts to a morphemic representation and then used standard word alignment a...